Letter to the Editor: On the stability and ranking of predictors from random forest variable importance measures

Author

  • Kristin K. Nicodemus
Abstract

A recent study examined the stability of rankings from random forests using two variable importance measures, mean decrease accuracy (MDA) and mean decrease Gini (MDG), and concluded that rankings based on the MDG were more robust than those based on the MDA. However, few studies have examined how data-specific characteristics affect ranking stability. Rankings based on the MDG measure showed sensitivity to within-predictor correlation and to differences in category frequencies, even when the number of categories was held constant, and thus may produce spurious results. The MDA measure was robust to these data characteristics. Further, under strong within-predictor correlation, MDG rankings were less stable than those using MDA.
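As a rough illustration of the two quantities the abstract compares, the sketch below computes a Gini impurity decrease for a single split (the building block of MDG) and a permutation-based accuracy decrease for a single predictor (the idea behind MDA). It is a minimal sketch, not the letter's method: a real random forest sums the Gini decrease over every split on a variable and averages over trees, and computes the accuracy decrease on each tree's out-of-bag samples. All function names here are hypothetical, and `predict` stands in for any fitted classifier.

```python
import random
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum_k p_k^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (
        (len(left) / n) * gini_impurity(left)
        + (len(right) / n) * gini_impurity(right)
    )
    return gini_impurity(parent) - weighted_children


def accuracy(predict, rows, labels):
    """Fraction of rows for which `predict` returns the true label."""
    return sum(predict(row) == y for row, y in zip(rows, labels)) / len(labels)


def permutation_accuracy_decrease(predict, rows, labels, column, rng):
    """Accuracy before minus accuracy after shuffling one predictor column.

    `rows` is a list of tuples; `rng` is a random.Random instance.
    """
    baseline = accuracy(predict, rows, labels)
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    permuted = [
        row[:column] + (value,) + row[column + 1:]
        for row, value in zip(rows, shuffled)
    ]
    return baseline - accuracy(predict, permuted, labels)
```

For example, `gini_decrease([0, 0, 1, 1], [0, 0], [1, 1])` scores a split that separates the classes perfectly, while `permutation_accuracy_decrease(predict, rows, labels, column=1, rng=random.Random(0))` measures how much accuracy is lost when the second predictor is scrambled. The first quantity depends on how the tree partitions the data on a predictor; the second depends only on the predictions.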


Similar resources

Letter to the Editor: Stability of Random Forest importance measures

The goal of this article (letter to the editor) is to emphasize the value of exploring ranking stability when using the importance measures, mean decrease accuracy (MDA) and mean decrease Gini (MDG), provided by Random Forest. We illustrate with a real and a simulated example that ranks based on the MDA are unstable to small perturbations of the dataset and ranks based on the MDG provide more r...


Psychological issues in children and youth during COVID-19 outbreak: A letter to Editor

The emerging COVID-19 disease has affected various aspects of people’s lives. The psychological effects of this disease, especially in children and adolescents, include a wide range of disorders such as anxiety, depression, post-traumatic stress disorder, the possibility of domestic violence, addiction and substance use, dependence on cyberspace and related complications, change in patterns and dail...


Fast Unsupervised Automobile Insurance Fraud Detection Based on Spectral Ranking of Anomalies

Collecting insurance fraud samples is costly and, if performed manually, very time consuming. This issue suggests the use of unsupervised models. One of the accurate methods in this regard is Spectral Ranking of Anomalies (SRA), which has been shown to work better than other methods specifically for auto insurance fraud detection. However, this approach is not scalable to large samples and is not appro...


Fear of Medical Staff: The Importance of Stigmatization during the COVID-19 Pandemic: Letter to the Editor

Introduction: During the COVID-19 pandemic, although hospital staff cared for patients, they were regarded in the community as asymptomatic carriers, and people were afraid and anxious about them, to the extent that even the families of hospital staff experienced this social stigma and many people cut off contact with them. In addition to the stigma that medical staff received from people du...


Reliability analysis of rubble-mound breakwaters against the failure due to the armor layer instability based on the fuzzy random variables theory: A case study of Anzali Port breakwater

Breakwaters are among the most frequently-used coastal protective structures and their stability is vital to avoid turbulence at the ports. The main purpose of the present research is to use the theory of fuzzy random variables and the second-order reliability method (SORM) to study the reliability of a rubble-mound breakwater against the failure due to the armor layer instability. The limit-st...



Journal:
  • Briefings in Bioinformatics

Volume 12, Issue 4

Pages: -

Publication date: 2011